What should we do when different groups in a population exhibit very different properties, and very different relationships between those properties? What errors might we make when we aggregate all the data and analyse it as one? And how do we avoid these errors if we are not sure which groups to consider together and which to consider apart?

This is an applied course on extending linear regression models so that they are more amenable to causal analysis.

Aim:

Method

The course is designed to follow the flipped-classroom approach. Readings and videos of the lectures are available before the course begins. You will be expected to read the relevant chapters in the textbook, watch the videos, and comment on them in Perusall PRIOR to attending the course. During the course itself, you will work on quizzes and lab assignments, which you may then submit at the end of each day.

Prepare

To prepare for the course, please complete the assignments, all of which are available on the Perusall platform. You can find the assignments in Nestor under Assignments. Click on the title of the item, CLICK ME, to access Perusall.

Completing an assignment involves watching the video and reading the text before the assigned deadline. Completed assignments count towards your attendance and participation in the course.

Grading

Resources

Online (Nestor, Perusall, GitHub): lectures, presentations, datasets, exercises, and background literature

You will find the slides and the datasets in this shared drive. You will first need to log in to your rug.nl account to access the files.

Further information can be found in the textbook (chapters available through a link on Nestor): Julian J. Faraway. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models.

… which was used as the framework for this course. Additional material (data, updates, errata) can be found on the following webpage.

To brush up on your R programming skills, there are a variety of free resources you can use, such as the SICSS bootcamp, Harvard’s famous R Basics course, or Stanford’s R Programming fundamentals. If you speak German, this site is quite nice.

You may work on and submit your assignments in any other statistical software, including SPSS, Stata, Python, or Julia. However, as R is becoming the lingua franca of data science in many academic circles, it is the default technology used in this course.